Rio Arriba County
StepSearch: Igniting LLMs Search Ability via Step-Wise Proximal Policy Optimization
Wang, Ziliang, Zheng, Xuhui, An, Kang, Ouyang, Cijun, Cai, Jialu, Wang, Yuhang, Wu, Yichao
Efficient multi-hop reasoning requires agents based on Large Language Models (LLMs) to acquire high-value external knowledge iteratively. Previous work has explored reinforcement learning (RL) to train LLMs to perform search-based document retrieval, achieving notable improvements in QA performance, but these methods underperform on complex, multi-hop QA because they rely on sparse rewards from a global signal only. To address this gap, we introduce StepSearch, a framework for search LLMs trained with a step-wise proximal policy optimization method. It combines richer, more detailed intermediate search rewards with token-level process supervision based on information gain and redundancy penalties to better guide each search step. We constructed a fine-grained question-answering dataset containing sub-question-level search trajectories, derived from open-source datasets through a data pipeline. On standard multi-hop QA benchmarks, StepSearch significantly outperforms global-reward baselines, achieving 11.2% and 4.2% absolute improvements for 3B and 7B models, respectively, over various search-with-RL baselines using only 19k training examples, demonstrating the effectiveness of fine-grained, stepwise supervision in optimizing deep search LLMs. Our code will be released on https://github.com/Zillwang/StepSearch.
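The idea of rewarding each search step by information gain and penalizing redundancy can be illustrated with a small sketch. This is not the StepSearch reward itself: the function name `step_rewards`, the use of document IDs, and the specific definitions of gain (share of gold documents newly retrieved) and penalty (share of a step's results already seen) are assumptions made here for illustration only.

```python
def step_rewards(retrieved_per_step, gold_docs):
    """Return one reward per search step: information gain minus redundancy.

    Hypothetical definitions (not taken from the paper):
      gain    = fraction of gold documents retrieved for the first time
      penalty = fraction of this step's results seen in earlier steps
    """
    seen = set()      # every document ID returned by earlier steps
    covered = set()   # gold document IDs already found
    rewards = []
    for docs in retrieved_per_step:
        docs = list(docs)
        new_gold = {d for d in docs if d in gold_docs and d not in covered}
        gain = len(new_gold) / len(gold_docs) if gold_docs else 0.0
        redundant = sum(1 for d in docs if d in seen)
        penalty = redundant / len(docs) if docs else 0.0
        rewards.append(gain - penalty)
        covered |= new_gold
        seen.update(docs)
    return rewards


# Three search steps over a question whose gold evidence is documents "a" and "c".
trajectory = [["a", "b"], ["b", "c"], ["a", "b"]]
print(step_rewards(trajectory, {"a", "c"}))
```

In this toy trajectory the third step retrieves nothing new, so it receives only the penalty, which is the kind of per-step signal a single global answer reward cannot provide.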
- Europe > Germany (0.14)
- Asia > Malaysia (0.14)
- Asia > Middle East > Kuwait (0.05)
Investigating the importance of social vulnerability in opioid-related mortality across the United States
Deas, Andrew, Spannaus, Adam, Maguire, Dakotah D., Trafton, Jodie, Kapadia, Anuj J., Maroulas, Vasileios
The opioid crisis remains a critical public health challenge in the United States. Despite national efforts which reduced opioid prescribing rates by nearly 45% between 2011 and 2021, opioid overdose deaths more than tripled during this same period. Such alarming trends raise important questions about what underlying social factors may be driving opioid misuse. Using county-level data across the United States, this study begins with a preliminary data analysis of how the rates of thirteen social vulnerability index variables manifest in counties with both anomalously high and low mortality rates, identifying patterns that warrant further investigation. Building on these findings, we further investigate the importance of the thirteen SVI variables within a machine learning framework by employing two predictive models: XGBoost and a modified autoencoder. Both models take the thirteen SVI variables as input and predict county-level opioid-related mortality rates. This allows us to leverage two distinct feature importance metrics: information gain for XGBoost and a Shapley gradient explainer for the autoencoder. These metrics offer two unique insights into the most important SVI factors in relation to opioid-related mortality. By identifying the variables which consistently rank as most important, this study highlights key social vulnerability factors that may play critical roles in the opioid crisis.
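To give a feel for gain-based feature importance, the sketch below ranks features by the largest squared-error reduction a single threshold split achieves. It is a deliberately simplified stand-in, not XGBoost's gain computation (which aggregates regularized gains over all splits in a boosted ensemble), and the helper names, feature names, and toy numbers are hypothetical rather than drawn from the study's data.

```python
def split_gain(xs, ys):
    """Largest reduction in squared error from one threshold split on xs."""
    def sse(vals):
        # Sum of squared deviations from the mean (0.0 for an empty side).
        if not vals:
            return 0.0
        mean = sum(vals) / len(vals)
        return sum((v - mean) ** 2 for v in vals)

    pairs = sorted(zip(xs, ys))
    total = sse([y for _, y in pairs])
    best = 0.0
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold can separate equal feature values
        left = [y for _, y in pairs[:i]]
        right = [y for _, y in pairs[i:]]
        best = max(best, total - sse(left) - sse(right))
    return best


def rank_features(columns, target):
    """Sort feature names by single-split gain, highest first."""
    gains = {name: split_gain(vals, target) for name, vals in columns.items()}
    return sorted(gains.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data: four counties, two made-up input variables.
columns = {"svi_var_1": [1, 2, 3, 4], "svi_var_2": [1, 1, 1, 1]}
mortality = [0, 0, 10, 10]
print(rank_features(columns, mortality))
```

A constant feature can never reduce the error, so it ranks last with zero gain, while a feature whose values separate low-mortality from high-mortality counties ranks first.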
- North America > United States > Connecticut (0.05)
- North America > United States > Wisconsin > Menominee County (0.04)
- North America > United States > Kentucky > Bath County (0.04)